A new initialisation to Control Gradients in Sinusoidal Neural network

Combette, Andrea, Venaille, Antoine, Pustelnik, Nelly

arXiv.org Artificial Intelligence

A proper initialisation strategy is of primary importance to mitigate gradient explosion or vanishing when training neural networks. Yet, the impact of initialisation parameters still lacks a precise theoretical understanding for several well-established architectures. Here, we propose a new initialisation for networks with sinusoidal activation functions such as SIREN, focusing on gradient control, the scaling of gradients with network depth, and their impact on training and generalization. To achieve this, we identify a closed-form expression for the initialisation of the parameters, differing from the original SIREN scheme. This expression is derived from fixed points obtained through the convergence of the pre-activation distribution and the variance of Jacobian sequences. Controlling gradients and targeting vanishing pre-activations helps prevent the emergence of inappropriate frequencies during estimation, thereby improving generalization. We further show that this initialisation strongly influences training dynamics through the Neural Tangent Kernel (NTK) framework. Finally, we benchmark SIREN with the proposed initialisation against the original scheme and other baselines on function fitting and image reconstruction. The new initialisation consistently outperforms state-of-the-art methods across a wide range of reconstruction tasks, including those involving physics-informed neural networks.
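For context, the original SIREN scheme that this abstract uses as its baseline can be sketched as follows. This is a minimal illustration of the baseline only, not the new closed-form initialisation proposed in the paper, which the abstract does not state. The function names `siren_init` and `siren_forward`, the default `w0 = 30`, and the zero biases are assumptions made for this sketch.

```python
import math
import random


def siren_init(layer_sizes, w0=30.0, seed=0):
    """Sketch of the original SIREN weight initialisation (baseline only).

    First layer:  W ~ U(-1/fan_in, 1/fan_in)
    Other layers: W ~ U(-sqrt(6/fan_in)/w0, sqrt(6/fan_in)/w0)
    Biases are set to zero here as a simplifying assumption.
    """
    rng = random.Random(seed)
    params = []
    for i, (fan_in, fan_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        bound = 1.0 / fan_in if i == 0 else math.sqrt(6.0 / fan_in) / w0
        weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        params.append((weights, biases))
    return params


def siren_forward(params, x, w0=30.0):
    """Apply sin(w0 * (W h + b)) on every layer except the last, which is linear."""
    h = list(x)
    last = len(params) - 1
    for k, (weights, biases) in enumerate(params):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        h = z if k == last else [math.sin(w0 * zj) for zj in z]
    return h


params = siren_init([2, 16, 16, 1])
output = siren_forward(params, [0.25, -0.5])
```

Scaling the hidden-layer bound by `1 / w0` is what keeps pre-activations in a controlled range in the baseline; the paper's contribution is a different choice of these bounds.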


SurveyGen: Quality-Aware Scientific Survey Generation with Large Language Models

Bao, Tong, Nayeem, Mir Tafseer, Rafiei, Davood, Zhang, Chengzhi

arXiv.org Artificial Intelligence

Automatic survey generation has emerged as a key task in scientific document processing. While large language models (LLMs) have shown promise in generating survey texts, the lack of standardized evaluation datasets critically hampers rigorous assessment of their performance against human-written surveys. In this work, we present SurveyGen, a large-scale dataset comprising over 4,200 human-written surveys across diverse scientific domains, along with 242,143 cited references and extensive quality-related metadata for both the surveys and the cited papers. Leveraging this resource, we build QUAL-SG, a novel quality-aware framework for survey generation that enhances the standard Retrieval-Augmented Generation (RAG) pipeline by incorporating quality-aware indicators into literature retrieval to assess and select higher-quality source papers. Using this dataset and framework, we systematically evaluate state-of-the-art LLMs under varying levels of human involvement - from fully automatic generation to human-guided writing. Experimental results and human evaluations show that while semi-automatic pipelines can achieve partially competitive outcomes, fully automatic survey generation still suffers from low citation quality and limited critical analysis.
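One way to picture quality-aware retrieval is as a re-ranking step that blends a relevance score with a quality indicator. The sketch below is purely illustrative: the linear blend, the `alpha` weight, the field names, and the function `rerank` are hypothetical and are not taken from the QUAL-SG framework, whose actual indicators and selection rule are not described in this abstract.

```python
def rerank(candidates, alpha=0.7, top_k=2):
    """Rank candidate papers by a blend of relevance and quality.

    candidates: list of dicts with keys 'id', 'relevance', 'quality',
                where both scores are assumed to lie in [0, 1].
    alpha:      hypothetical weight on relevance; (1 - alpha) goes to quality.
    """
    def combined(c):
        return alpha * c["relevance"] + (1.0 - alpha) * c["quality"]

    ranked = sorted(candidates, key=combined, reverse=True)
    return [c["id"] for c in ranked[:top_k]]


papers = [
    {"id": "a", "relevance": 0.9, "quality": 0.1},
    {"id": "b", "relevance": 0.8, "quality": 0.9},
    {"id": "c", "relevance": 0.2, "quality": 0.95},
]
selected = rerank(papers)  # "b" outranks "a" once quality is considered
```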


WavePulse: Real-time Content Analytics of Radio Livestreams

Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay

arXiv.org Artificial Intelligence

Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.


Continuous Field Reconstruction from Sparse Observations with Implicit Neural Networks

Luo, Xihaier, Xu, Wei, Ren, Yihui, Yoo, Shinjae, Nadiga, Balu

arXiv.org Artificial Intelligence

Reliably reconstructing physical fields from sparse sensor data is a challenge that frequently arises in many scientific domains. In practice, the process generating the data is often not understood to sufficient accuracy. Therefore, there is a growing interest in using deep neural networks to address the problem. This work presents a novel approach that learns a continuous representation of the physical field using implicit neural representations (INRs). Specifically, after factorizing spatiotemporal variability into spatial and temporal components using the separation of variables technique, the method learns relevant basis functions from sparsely sampled irregular data points to develop a continuous representation of the data. In experimental evaluations, the proposed model outperforms recent INR methods, offering superior reconstruction quality on simulation data from a state-of-the-art climate model and a second dataset that comprises ultra-high resolution satellite-based sea surface temperature fields.

Achieving accurate and comprehensive representation of complex physical fields is pivotal for tasks spanning system monitoring and control, analysis, and design. However, in a multitude of applications, encompassing geophysics (Reichstein et al., 2019), astronomy (Gabbard et al., 2022), biochemistry (Zhong et al., 2021), fluid mechanics (Deng et al., 2023), and others, using a sparse sensor network proves to be the most practical and effective solution. In meteorology and oceanography, variables such as atmospheric pressure, temperature, salinity/humidity, and wind/current velocity must be reconstructed from sparsely sampled observations. Currently, two distinct approaches are used to reconstruct full fields from sparse observations. Traditional physics model-based approaches are based on partial differential equations (PDEs). These approaches draw upon theoretical techniques to derive PDEs rooted in conservation laws and fundamental physical principles (Hughes, 2012). Yet, in complex systems such as weather (Brunton et al., 2016) and epidemiology (Massucci et al., 2016), deriving comprehensive models that are both sufficiently accurate and computationally efficient remains elusive.
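The separation-of-variables idea mentioned in the abstract can be illustrated with a rank-K separable field, u(x, t) ≈ Σ_k c_k φ_k(x) ψ_k(t), which can be evaluated at any continuous location and time. The sketch below is only a stand-in under stated assumptions: the paper learns its basis functions from data, whereas here `spatial_basis` and `temporal_basis` are fixed, hand-picked functions, and the names and coefficients are hypothetical.

```python
import math


def spatial_basis(x, k):
    """Hypothetical fixed spatial basis: sine modes on the interval [0, 1]."""
    return math.sin((k + 1) * math.pi * x)


def temporal_basis(t, k):
    """Hypothetical fixed temporal basis: decaying exponentials."""
    return math.exp(-(k + 1) * t)


def field(x, t, coeffs):
    """Evaluate u(x, t) = sum_k c_k * phi_k(x) * psi_k(t) at any (x, t)."""
    return sum(c * spatial_basis(x, k) * temporal_basis(t, k)
               for k, c in enumerate(coeffs))


# The representation is continuous: it can be queried off any sensor grid.
value = field(0.37, 0.12, coeffs=[1.0, 0.5, 0.25])
```

In a learned version, the two basis functions would be replaced by small networks and the coefficients fitted to the sparse, irregular observations.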


Neural Structure Fields with Application to Crystal Structure Autoencoders

Chiba, Naoya, Suzuki, Yuta, Taniai, Tatsunori, Igarashi, Ryo, Ushiku, Yoshitaka, Saito, Kotaro, Ono, Kanta

arXiv.org Artificial Intelligence

Representing crystal structures of materials to facilitate determining them via neural networks is crucial for enabling machine-learning applications involving crystal structure estimation. Among these applications, the inverse design of materials can contribute to exploring materials with desired properties without relying on luck or serendipity. We propose neural structure fields (NeSF) as an accurate and practical approach for representing crystal structures using neural networks. Inspired by the concepts of vector fields in physics and implicit neural representations in computer vision, the proposed NeSF considers a crystal structure as a continuous field rather than as a discrete set of atoms. Unlike existing grid-based discretized spatial representations, the NeSF overcomes the tradeoff between spatial resolution and computational complexity and can represent any crystal structure. We propose an autoencoder of crystal structures that can recover various crystal structures, such as those of perovskite structure materials and cuprate superconductors. Extensive quantitative results demonstrate the superior performance of the NeSF compared with the existing grid-based approach.